Abstract

A new fast parallel algorithm for computing the GCD of many polynomials over integral domains is
presented. If the number of input polynomials is $r$ and the input polynomials have degree at most
$m$, then this algorithm runs in parallel time $O(\log^2 m + \log^2 r)$ using $O(m^{4.5} r^{5.5})$
processors. This matches the bounds of a similar work by von zur Gathen [8].

Key words. Euclid's algorithm, greatest common divisors, parallel algorithms, polynomial
remainder sequences, resultant, subresultant chain.

1      Introduction

In this paper, we suppose that $I$ is an integral domain and $F$ is its quotient field. For the GCD
computation over integral domains we need the following definitions. Let $p(x), p'(x) \in I[x]$.
Define the pseudo-remainder of $p(x)$ divided by $p'(x)$ to be the following polynomial
$\mathrm{prem}(p, p')$: if $\deg(p) < \deg(p')$ or $p' = 0$ then $\mathrm{prem}(p, p') = p$.
Otherwise, $\mathrm{prem}(p, p')$ is the polynomial $r(x) \in I[x]$ such that there exists
$q(x) \in I[x]$ with

$$r(x) + p'(x)\,q(x) = b^{d+1} p(x), \qquad \deg(r) < \deg(p') \tag{1}$$

where $b = \mathrm{lead}(p'(x))$ is the leading coefficient of $p'$ and $d = \deg(p(x)) - \deg(p'(x))$.

Definition. Let $p(x), q(x) \in I[x]$. Then $p$ and $q$ are similar, denoted $p \sim q$, if there exist
non-zero $a, b \in I$ such that

$$a\,p(x) = b\,q(x).$$

A special case of this is where $a$ and $b$ are units, in which case $p$ and $q$ are said to be
associates of each other.

Definition. Let $p(x), q(x)$ be non-zero polynomials in $I[x]$ with $\deg(p) > \deg(q)$. Then a sequence
$(p_0, p_1, \ldots, p_h)$ $(h \ge 1)$ is called a polynomial remainder sequence (PRS) of $p$ and $q$ if
$p_0 = p$, $p_1 = q$ and for $i = 2, 3, \ldots, h$,

$$p_i \sim \mathrm{prem}(p_{i-2}, p_{i-1})$$

and

$$\mathrm{prem}(p_{h-1}, p_h) = 0.$$

Following Collins [5], we call a PRS $(p_0, p_1, \ldots, p_h)$ the Euclidean PRS if
$p_i = \mathrm{prem}(p_{i-2}, p_{i-1})$ for $i = 2, \ldots, h$. A GCD of $p$ and $q$ is a polynomial which is
similar to $p_h$. Thus, the GCD of $p$ and $q$ is unique up to similarity.

Euclid's algorithm for computing the GCD of two polynomials can suffer from exponential
intermediate expression swell if the coefficients of the polynomials are integers (Brown [3]). To see
this, we compare the Euclidean PRS to the primitive PRS. A PRS $(q_0, \ldots, q_h)$ is a primitive PRS if
for each $i = 2, \ldots, h$,

$$q_i = \frac{\mathrm{prem}(q_{i-2}, q_{i-1})}{\beta_{i-1}}$$

where $\beta_{i-1}$ is the content of $\mathrm{prem}(q_{i-2}, q_{i-1})$. Let $(p_0, p_1, \ldots, p_h)$ be the
Euclidean PRS with $p_0 = q_0$ and $p_1 = q_1$. The following lemma is useful (Yap [14]):

Lemma 1 Let $A, B \in I[x]$ and $\alpha, \beta \in I$. Then

$$\mathrm{prem}(\alpha A, \beta B) = \alpha\,\beta^{\delta+1}\,\mathrm{prem}(A, B)$$

where $\delta = \deg(A) - \deg(B)$.
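The pseudo-remainder of equation (1) and the scaling rule of Lemma 1 can be checked on a small case. The following is a minimal sketch, assuming $I$ is the ring of integers and that a polynomial is stored as a list of coefficients with the highest power first; the helper names `trim` and `prem` are illustrative and not taken from the paper.

```python
def trim(p):
    """Drop leading zeros from a coefficient list (highest power first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return list(p[i:])


def prem(p, q):
    """Pseudo-remainder r of p by q, so that r + q*s = b**(d+1) * p as in (1)."""
    p, q = trim(p), trim(q)
    if not q or len(p) < len(q):      # q = 0 or deg(p) < deg(q)
        return p
    b = q[0]                          # leading coefficient of q
    d = len(p) - len(q)               # deg(p) - deg(q)
    r = [b ** (d + 1) * c for c in p]
    for k in range(d + 1):
        factor = r[k] // b            # exact: the scaling cleared every denominator
        for j, c in enumerate(q):
            r[k + j] -= factor * c
    return trim(r[d + 1:])


A, B = [1, 0, 1], [2, 1]              # x^2 + 1 and 2x + 1
alpha, beta = 3, 5
delta = (len(A) - 1) - (len(B) - 1)
lhs = prem([alpha * c for c in A], [beta * c for c in B])
rhs = [alpha * beta ** (delta + 1) * c for c in prem(A, B)]
print(prem(A, B), lhs, rhs)
```

Here `lhs` is the left-hand side of Lemma 1 and `rhs` the right-hand side; the zero polynomial is represented by the empty list.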

For each $i = 1, \ldots, h-1$, let $\delta_i = \deg(p_i) - \deg(p_{i+1})$. We know that
$p_2 = \beta_1 \cdot q_2$. Clearly, from Lemma 1, for $i = 3, \ldots, h$,

$$p_i = \xi_i\,\beta_1^{(\delta_1+1)(\delta_2+1)\cdots(\delta_{i-2}+1)}\,q_i$$

where $\xi_i$ is some constant. But for each $j$, $\delta_j + 1 \ge 2$. Thus,
$p_h = \xi\,\beta_1^{2^{h-2}} \cdot q_h$ for some $\xi \in I$. And $h = O(n)$ where $n = \deg(p_0)$. Thus, if
we use binary representation for the coefficients, the coefficients of $p_h$ can be exponentially
(i.e. $2^{O(n)}$ times) longer than the coefficients of $q_h$. The theory of subresultants
(Collins [6], Brown-Traub [4], Ho-Yap [9]) was developed to avoid the difficulty of the exponential
coefficient growth in the GCD computation. In this paper, we extend the theory of subresultants
from two polynomials to several polynomials. The extended theory of subresultants provides a new
parallel algorithm for computing the GCD of many polynomials over an integral domain. The model of
computation used in this paper is the PRAM (Fortune-Wyllie [7], Hong [10]) with arithmetic and
tests in $I$ as basic operations. Our result is therefore an alternative to the algorithm of von zur
Gathen [8], achieving the same parallel time complexity and the same processor bound.
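The coefficient growth of the Euclidean PRS relative to the primitive PRS can be observed directly. The sketch below is an illustration under assumptions, not part of the paper: it works over the integers, stores polynomials as coefficient lists with the highest power first, and uses the hypothetical helper names `trim`, `prem` and `prs`. The two input polynomials are a classical textbook pair, chosen here only as sample data.

```python
from functools import reduce
from math import gcd


def trim(p):
    """Drop leading zeros from a coefficient list (highest power first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return list(p[i:])


def prem(p, q):
    """Pseudo-remainder of p by q following equation (1)."""
    p, q = trim(p), trim(q)
    if not q or len(p) < len(q):
        return p
    b, d = q[0], len(p) - len(q)
    r = [b ** (d + 1) * c for c in p]
    for k in range(d + 1):
        factor = r[k] // b            # exact division
        for j, c in enumerate(q):
            r[k + j] -= factor * c
    return trim(r[d + 1:])


def prs(p, q, primitive):
    """Euclidean PRS, or primitive PRS when each remainder is divided by its content."""
    seq = [trim(p), trim(q)]
    while True:
        r = prem(seq[-2], seq[-1])
        if not r:                     # prem(p_{h-1}, p_h) = 0 ends the sequence
            return seq
        if primitive:
            beta = reduce(gcd, (abs(c) for c in r))
            r = [c // beta for c in r]
        seq.append(r)


A = [1, 0, 1, 0, -3, -3, 8, 2, -5]
B = [3, 0, 5, 0, -4, -9, 21]
euclidean = prs(A, B, primitive=False)
primitive = prs(A, B, primitive=True)
for e, q in zip(euclidean, primitive):
    # bit length of the largest coefficient in each sequence element
    print(max(abs(c) for c in e).bit_length(), max(abs(c) for c in q).bit_length())
```

The printed pairs compare the size in bits of the largest coefficient at each step of the two sequences.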

The theory of subresultants also provides a method to construct the extended Euclidean scheme
of two polynomials over a field by solving systems of linear equations. Borodin-von zur
Gathen-Hopcroft [2] used the method of solving systems of linear equations to construct parallel
algorithms for computing the GCD of two polynomials. Von zur Gathen [8] employed this method in a
parallel algorithm for the GCD of many polynomials. If the input polynomials have degree at most
$n$ and the number of input polynomials is also $n$, then, over Q or R, von zur Gathen's algorithm
runs in parallel time $O(\log^2 n)$; however, over an arbitrary field, his algorithm runs in
probabilistic parallel time $O(\log^2 n)$. Subsequently, Mulmuley [13] gave an $O(\log^2 n)$ parallel
time algorithm to compute the rank of a matrix over an arbitrary field. This immediately implies
that von zur Gathen's algorithm now runs in parallel time $O(\log^2 n)$ over an arbitrary field. It is
worth mentioning that the Subresultant Structure Theorem (Ho-Yap [9]) can provide parallel
algorithms to compute the subresultant chain, and then the subresultant PRS, of two polynomials
over an integral domain. The algorithms will be shown in Section 7.

2     The  Extended  Euclidean  Scheme 

Let us consider $r$ polynomials belonging to $I[x]$ of degree $d_1, d_2, \ldots, d_r$ respectively:

$$p_i(x) = a_{i,d_i} x^{d_i} + a_{i,d_i-1} x^{d_i-1} + \cdots + a_{i,0} \qquad (i = 1, 2, \ldots, r).$$

Suppose $m = \max\{d_i;\ 1 \le i \le r\}$ and $n = \min\{d_i;\ 1 \le i \le r\}$. Suppose

$$\deg(\mathrm{GCD}(p_1, p_2, \ldots, p_r)) = h \tag{2}$$

Let $g(x) = \mathrm{GCD}(p_1, p_2, \ldots, p_r) = g_h x^h + g_{h-1} x^{h-1} + \cdots + g_0$. Then, there exist
$s_1, s_2, \ldots, s_r \in I[x]$ such that (von zur Gathen [8])

$$\sum_{i=1}^{r} s_i p_i = g \qquad \text{and} \qquad \deg(s_i) < n. \tag{3}$$

Recalling the theory of subresultants, we define the matrix $R_k$ for $0 \le k \le n$ as follows:

$$
R_k = \begin{pmatrix}
a_{1,d_1} & a_{1,d_1-1} & \cdots & a_{1,0} & & & \\
 & a_{1,d_1} & a_{1,d_1-1} & \cdots & a_{1,0} & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & a_{1,d_1} & a_{1,d_1-1} & \cdots & a_{1,0} \\
 & & & \vdots & & & \\
a_{r,d_r} & a_{r,d_r-1} & \cdots & a_{r,0} & & & \\
 & a_{r,d_r} & a_{r,d_r-1} & \cdots & a_{r,0} & & \\
 & & \ddots & \ddots & & \ddots & \\
 & & & a_{r,d_r} & a_{r,d_r-1} & \cdots & a_{r,0}
\end{pmatrix} \tag{4}
$$

where there are $m + n - d_i - k$ rows constructed from the coefficients of $p_i$ for each
$i = 1, 2, \ldots, r$ and $m + n - k$ columns; the blank spaces are filled with zeroes.

Definition. It is natural to call $R_0$ the Sylvester matrix of $p_1, p_2, \ldots, p_r$.

Let $G = [0, 0, \ldots, 0, g_h, g_{h-1}, \ldots, g_0]$ be the $(m+n)$-vector formed from the coefficients of
$g(x)$, padded with leading zeroes. Since, from (3), $\deg(s_i) < n$ for $i = 1, 2, \ldots, r$, we suppose

$$s_i = b_{i,m+n-d_i-1}\,x^{m+n-d_i-1} + b_{i,m+n-d_i-2}\,x^{m+n-d_i-2} + \cdots + b_{i,0} \tag{5}$$

Let $B = [b_{1,m+n-d_1-1}, b_{1,m+n-d_1-2}, \ldots, b_{1,0}, \ldots, b_{r,m+n-d_r-1}, b_{r,m+n-d_r-2}, \ldots, b_{r,0}]$.
We can rewrite (3) in the following matrix form:

$$R_0^T B^T = G^T \tag{6}$$
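The layout of $R_k$ in (4) can be made concrete with a short constructor. This is a minimal sketch under assumptions: coefficients are integers, each polynomial is a list with the highest power first and a non-zero leading entry, and the function name `build_R` is hypothetical.

```python
def build_R(polys, k=0):
    """Return R_k of (4) as a list of rows.

    Block i holds m + n - d_i - k shifted copies of the coefficients of p_i,
    and every row has m + n - k columns.
    """
    degs = [len(p) - 1 for p in polys]
    m, n = max(degs), min(degs)
    cols = m + n - k
    rows = []
    for p, d in zip(polys, degs):
        for shift in range(m + n - d - k):
            rows.append([0] * shift + list(p) + [0] * (cols - shift - d - 1))
    return rows


p1 = [1, -3, 2]      # x^2 - 3x + 2
p2 = [1, -1]         # x - 1
for row in build_R([p1, p2], 0):
    print(row)
```

With $k = 0$ this yields the Sylvester matrix $R_0$; for two polynomials it coincides with the classical Sylvester matrix.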

3     The  Resultant  of  Several  Polynomials 

The relation between the degree of the GCD of several polynomials and the rank of $R_k$ was first
shown by K. Kakie [11]. The following theorem is the main result of Kakie [11].

Let $K$ be a field and let $K[x, y]$ denote the ring of polynomials in two indeterminates $x, y$. Let
us consider $r$ homogeneous polynomials belonging to $K[x, y]$ of degree $d_1, d_2, \ldots, d_r$
respectively:

$$P_\alpha(x, y) = a_{\alpha,d_\alpha} x^{d_\alpha} + a_{\alpha,d_\alpha-1} x^{d_\alpha-1} y + \cdots + a_{\alpha,0}\,y^{d_\alpha} \qquad (\alpha = 1, 2, \ldots, r).$$

Suppose $m = \max\{d_\alpha;\ 1 \le \alpha \le r\}$ and $n = \min\{d_\alpha;\ 1 \le \alpha \le r\}$. One can define
$R_k$ as the form in (4) for $k = 0, 1, \ldots, n-1$.

Theorem 2 (Kakie) The highest common factor of the homogeneous polynomials $P_1, P_2, \ldots, P_r$ is
of degree $k$ if and only if

$$\begin{cases} \operatorname{rank} R_k = m + n - 2k, \\ \operatorname{rank} R_{k-1} = \operatorname{rank} R_k + 1. \end{cases}$$

We can have the same theorem for polynomials $p_1, p_2, \ldots, p_r \in I[x]$. Let
$p_i = \sum_{j=0}^{d_i} a_{ij} x^j$. Define a map $\phi : I[x] \to I[x, y]$ by

$$\phi(p_i) = \sum_{j=0}^{d_i} a_{ij}\,x^j y^{d_i - j}.$$

We can say that $p_i$ is homogenized by the map $\phi$. Clearly, $\phi$ is an isomorphism between $I[x]$
and regular homogeneous polynomials in $I[x, y]$. (A homogeneous polynomial in $I[x, y]$ is said to be
regular if the coefficient of the highest power of $x$ in the homogeneous polynomial is free of $y$.)
The following theorem follows from the isomorphism $\phi$ and Theorem 2.

Theorem 3 The GCD of the polynomials $p_1, p_2, \ldots, p_r$ is of degree $h$ if and only if

$$\begin{cases} \operatorname{rank} R_h = m + n - 2h, \\ \operatorname{rank} R_{h-1} = \operatorname{rank} R_h + 1. \end{cases}$$

Theorem 4 and Lemma 5 are obtained by applying $\phi^{-1}$ to Theorem A and the Lemma, both found in
Kakie [11].

Theorem 4 In order that the polynomials $p_1, p_2, \ldots, p_r$ have a common factor of degree $> k$, it
is necessary and sufficient that $\operatorname{rank} R_k$ is less than $m + n - 2k$.

Lemma 5 Assume that $\operatorname{rank} R_k$ is less than $m + n - 2k$. Then for each integer
$\alpha = 0, 1, \ldots, k$, $\operatorname{rank} R_{k-\alpha} = \operatorname{rank} R_k + \alpha$.

Corollary 6 Suppose the GCD of the polynomials $p_1, p_2, \ldots, p_r$ is of degree $h$. If $0 \le k < h$,
then $\operatorname{rank} R_k = m + n - h - k$.

Proof From Theorem 3, we know that $\operatorname{rank} R_{h-1} = m + n - h - (h-1) < m + n - 2(h-1)$.
Choose $\alpha$ to be $(h-1) - k \ge 0$. Then, from Lemma 5,
$\operatorname{rank} R_{(h-1)-\alpha} = \operatorname{rank} R_{h-1} + \alpha = m + n - h - (h-1) + \alpha$.
Thus, $\operatorname{rank} R_k = m + n - h - k$. □

Theorem 7 Let $\operatorname{rank} R_k = m + n - h - k$ and $k < h$. Then the GCD of the polynomials
$p_1, p_2, \ldots, p_r$ is of degree $h$.

Proof Suppose the GCD is of degree $h' > h$. Then, by Corollary 6, for all $k$ where $k < h < h'$,
$\operatorname{rank} R_k = m + n - h' - k \ne m + n - h - k$, which is a contradiction. Suppose the GCD is
of degree $h' < h$. Then, by Corollary 6, for all $k$ such that $k < h'$,
$\operatorname{rank} R_k = m + n - h' - k \ne m + n - h - k$. If $k = h'$, by Theorem 3,
$\operatorname{rank} R_k = m + n - 2k \ne m + n - h - k$. For all $k$ such that $h' < k < h$, by Theorem 4,
$\operatorname{rank} R_k \ge m + n - 2k$; but $m + n - h - k < m + n - 2k$; thus
$\operatorname{rank} R_k \ne m + n - h - k$. All these cases lead to contradictions. Hence, $h' = h$. □

Corollary 8 The GCD of the polynomials $p_1, p_2, \ldots, p_r$ is of degree $h$ if and only if
$\operatorname{rank} R_0 = m + n - h$.
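Corollary 8 turns the degree of the GCD into a rank computation. The sketch below illustrates this sequentially, assuming integer coefficients stored highest power first; exact Gaussian elimination over `fractions.Fraction` stands in for a parallel rank algorithm, and the names `build_R`, `rank` and `gcd_degree` are hypothetical.

```python
from fractions import Fraction


def build_R(polys, k=0):
    """R_k of (4): block i has m + n - d_i - k shifted coefficient rows."""
    degs = [len(p) - 1 for p in polys]
    m, n = max(degs), min(degs)
    cols = m + n - k
    return [[0] * s + list(p) + [0] * (cols - s - d - 1)
            for p, d in zip(polys, degs) for s in range(m + n - d - k)]


def rank(matrix):
    """Rank over the quotient field, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def gcd_degree(polys):
    """h = m + n - rank R_0, as in Corollary 8."""
    degs = [len(p) - 1 for p in polys]
    return max(degs) + min(degs) - rank(build_R(polys, 0))


polys = [[1, -3, 2], [1, -4, 3], [1, 0, -1]]   # all three are multiples of x - 1
print(gcd_degree(polys), rank(build_R(polys, 1)))
```

For this input the second printed value is the rank of $R_1$, which Theorem 3 predicts to be $m + n - 2h$.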

For any square matrix $M$, we let $\det(M)$ denote its determinant. Suppose $U_1, U_2, \ldots, U_k$ are
$k$ (row) $l$-vectors. Then,

$$M = \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_k \end{pmatrix}$$

denotes the $k \times l$ matrix whose $(i, j)$th entry is the $j$th element of $U_i$.

Definition. Let $M$ be a $k \times l$ matrix where $k \le l$. Then the determinant polynomial of $M$,
$\mathrm{detpol}(M)$, is

$$\det(M_k)\,x^{l-k} + \det(M_{k+1})\,x^{l-k-1} + \cdots + \det(M_{l-1})\,x + \det(M_l)$$

where $M_i$ ($i \ge k$) is the square submatrix of $M$ consisting of the first $k - 1$ columns and the
$i$th column of $M$. We call $\det(M_k)$ the nominal leading coefficient of $\mathrm{detpol}(M)$.
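The determinant polynomial can be computed directly from its definition. The following is a sequential sketch for integer matrices, assuming a fraction-free (Bareiss) elimination for the determinants; the function names `det` and `detpol` are illustrative, and the result is returned as a coefficient list with the highest power first.

```python
def det(a):
    """Determinant of a square integer matrix by fraction-free (Bareiss) elimination."""
    a = [row[:] for row in a]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:                       # find a row with a usable pivot
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # the division by the previous pivot is always exact
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def detpol(m):
    """Coefficients of detpol(M), highest power first, for a k x l matrix with 1 <= k <= l."""
    k, l = len(m), len(m[0])
    # M_i keeps the first k - 1 columns and appends column i (0-based index j below)
    return [det([row[:k - 1] + [row[j]] for row in m]) for j in range(k - 1, l)]


M = [[1, -3, 2],
     [1, -1, 0]]
print(detpol(M))
```

The two rows of `M` are independent rows of the Sylvester matrix of $x^2 - 3x + 2$ and $x - 1$, so by Theorem 9 the printed polynomial should be similar to their GCD.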

Theorem 9 Suppose $U_1, U_2, \ldots, U_{m+n-h}$ are $m + n - h$ linearly independent row-vectors of
$R_0$. Let

$$M = \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_{m+n-h} \end{pmatrix}$$

Then the GCD of the polynomials $p_1, p_2, \ldots, p_r$ and $\mathrm{detpol}(M)$ are similar.
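Proof. Suppose $V_1, V_2, \ldots, V_\nu$ are the row-vectors of $R_0$ such that

$$R_0 = \begin{pmatrix} V_1 \\ V_2 \\ \vdots \\ V_\nu \end{pmatrix}$$

We rename $b_{1,m+n-d_1-1}, b_{1,m+n-d_1-2}, \ldots, b_{1,0}, \ldots, b_{r,m+n-d_r-1}, b_{r,m+n-d_r-2}, \ldots, b_{r,0}$
in Section 2 (5) to be $b_1, b_2, \ldots, b_\nu$, so

$$B = [b_{1,m+n-d_1-1}, b_{1,m+n-d_1-2}, \ldots, b_{1,0}, \ldots, b_{r,m+n-d_r-1}, b_{r,m+n-d_r-2}, \ldots, b_{r,0}] = [b_1, b_2, \ldots, b_\nu].$$

Then, we rewrite (6) in Section 2 as follows:

$$b_1 V_1^T + b_2 V_2^T + \cdots + b_\nu V_\nu^T = G^T \tag{7}$$

From Corollary 8, we know that, for all $V_i \notin \{U_1, U_2, \ldots, U_{m+n-h}\}$, $1 \le i \le \nu$, there
exist $\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{i,m+n-h}, \beta_i \in I$ such that they are not all zeroes and

$$\beta_i V_i = \sum_{j=1}^{m+n-h} \alpha_{ij} U_j.$$

Thus

$$b_i V_i^T = \sum_{j=1}^{m+n-h} \frac{b_i\,\alpha_{ij}}{\beta_i}\,U_j^T.$$

Then (7) can be transformed to

$$t_1 U_1^T + t_2 U_2^T + \cdots + t_{m+n-h} U_{m+n-h}^T = G^T \tag{8}$$

Suppose the $(m+n)$-vector $U_i$ has the form $[u_{i1}, u_{i2}, \ldots, u_{i,m+n}]$ for
$i = 1, 2, \ldots, m + n - h$. We rewrite (8) as follows:

$$
\begin{pmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n} & u_{2,m+n} & \cdots & u_{m+n-h,m+n}
\end{pmatrix}
\begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_{m+n-h} \end{pmatrix}
=
\begin{pmatrix} 0 \\ \vdots \\ 0 \\ g_h \\ \vdots \\ g_0 \end{pmatrix}
\tag{9}
$$

We know that

$$
\begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h} & u_{2,m+n-h} & \cdots & u_{m+n-h,m+n-h}
\end{vmatrix} \ne 0.
$$

Otherwise, by working backwards from (9), (7), (6) and (3), there would be
$s'_1, s'_2, \ldots, s'_r \in I[x]$ such that $s'_1 p_1 + s'_2 p_2 + \cdots + s'_r p_r = g'$ and
$\deg(g') < h$; then, the degree of $g'$ is less than the degree of the GCD, which contradicts (2).
For $0 \le i \le h$, if $g_i \ne 0$ then

$$
\begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h,m+n-h-1} \\
u_{1,m+n-i} & u_{2,m+n-i} & \cdots & u_{m+n-h,m+n-i}
\end{vmatrix} \ne 0
$$

and then, by Cramer's rule,

$$
t_{m+n-h} =
\frac{\begin{vmatrix}
u_{11} & \cdots & u_{m+n-h-1,1} & 0 \\
u_{12} & \cdots & u_{m+n-h-1,2} & 0 \\
\vdots & & \vdots & \vdots \\
u_{1,m+n-h-1} & \cdots & u_{m+n-h-1,m+n-h-1} & 0 \\
u_{1,m+n-i} & \cdots & u_{m+n-h-1,m+n-i} & g_i
\end{vmatrix}}
{\begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h,m+n-h-1} \\
u_{1,m+n-i} & u_{2,m+n-i} & \cdots & u_{m+n-h,m+n-i}
\end{vmatrix}}
=
\frac{g_i \cdot \begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h-1,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h-1,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h-1,m+n-h-1}
\end{vmatrix}}
{\begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h,m+n-h-1} \\
u_{1,m+n-i} & u_{2,m+n-i} & \cdots & u_{m+n-h,m+n-i}
\end{vmatrix}}
\tag{10}
$$

Since $t_{m+n-h} \in F$, one can let $t_{m+n-h} = \alpha / \beta$ for some $\alpha, \beta \in I$. Let

$$
\xi = \begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h-1,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h-1,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h-1,m+n-h-1}
\end{vmatrix}
$$

Thus $\xi \in I$ is a constant. Then, from (10), we have

$$
\alpha \cdot \begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h,m+n-h-1} \\
u_{1,m+n-i} & u_{2,m+n-i} & \cdots & u_{m+n-h,m+n-i}
\end{vmatrix}
= \beta \cdot \xi \cdot g_i
\tag{11}
$$

We just discussed the case of $g_i \ne 0$. Now, we consider the other case. For $0 \le i \le h$, if
$g_i = 0$ then

$$
\begin{vmatrix}
u_{11} & u_{21} & \cdots & u_{m+n-h,1} \\
u_{12} & u_{22} & \cdots & u_{m+n-h,2} \\
\vdots & \vdots & & \vdots \\
u_{1,m+n-h-1} & u_{2,m+n-h-1} & \cdots & u_{m+n-h,m+n-h-1} \\
u_{1,m+n-i} & u_{2,m+n-i} & \cdots & u_{m+n-h,m+n-i}
\end{vmatrix}
= g_i = 0
\tag{12}
$$

Thus, from (11) and (12), we conclude that

$$\alpha\,\mathrm{detpol}(M) = \beta\,\xi\,g(x)$$

Hence, $\mathrm{detpol}(M)$ and the GCD of the polynomials $p_1, p_2, \ldots, p_r$ are similar. □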
Now we can define the resultant of several polynomials. Let $p_1, p_2, \ldots, p_r \in I[x]$ with
$m = \max\{\deg(p_i);\ 1 \le i \le r\}$ and $n = \min\{\deg(p_i);\ 1 \le i \le r\}$. Let $R_0$ be the
Sylvester matrix of $p_1, p_2, \ldots, p_r$. The rows of $R_0$ span a vector space, the row space of
$R_0$. We say that the $i$th row of $R_0$ is independent or dependent according as it is linearly
independent or linearly dependent of the first $i - 1$ row-vectors of $R_0$. Clearly, the rank of
$R_0$ is equal to the number of independent rows in $R_0$.

Definition. Suppose the rank of $R_0$ is $\eta$. Let $U_1, U_2, \ldots, U_\eta$ be the first $\eta$
independent rows of $R_0$ and $U_{\eta+1}, \ldots, U_{m+n}$ the first $m + n - \eta$ dependent rows of
$R_0$. Define the resultant of $p_1, p_2, \ldots, p_r$ to be

$$\mathrm{res}(p_1, p_2, \ldots, p_r) = \det \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_{m+n} \end{pmatrix}$$

From Theorem 9, we immediately get the following theorem:

Theorem 10 The polynomials $p_1, p_2, \ldots, p_r$ have a nontrivial common factor if and only if

$$\mathrm{res}(p_1, p_2, \ldots, p_r) = 0$$


4     The  Subresultants  of  Several  Polynomials 


Recall the definition of $R_k$ in Section 2 (4). We define the subresultants of
$p_1, p_2, \ldots, p_r$ as follows:

Definition. Suppose the rank of $R_k$ is $\eta$. If $\eta \ge m + n - 2k$, then let
$U_1, U_2, \ldots, U_{m+n-2k}$ be the first $m + n - 2k$ independent rows of $R_k$; otherwise, let
$U_1, U_2, \ldots, U_\eta$ be the first $\eta$ independent rows of $R_k$ and
$U_{\eta+1}, \ldots, U_{m+n-2k}$ be the first $m + n - 2k - \eta$ dependent rows of $R_k$. Define the
$k$-th subresultant of $p_1, p_2, \ldots, p_r$ to be

$$\mathrm{sres}_k(p_1, p_2, \ldots, p_r) = \mathrm{detpol} \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_{m+n-2k} \end{pmatrix}$$

Note that $\mathrm{sres}_0(p_1, p_2, \ldots, p_r)$ is the resultant of $p_1, p_2, \ldots, p_r$. Obviously, from
Corollary 6, if the GCD of $p_1, p_2, \ldots, p_r$ is of degree $h > 0$, then

$$\mathrm{sres}_{h-1}(p_1, p_2, \ldots, p_r) = \cdots = \mathrm{sres}_0(p_1, p_2, \ldots, p_r) = 0.$$

Theorem 11 If the GCD of $p_1, p_2, \ldots, p_r$ is of degree $h$, then the $h$-th subresultant and the
GCD of $p_1, p_2, \ldots, p_r$ are similar.

Proof. Suppose $U = [u_1, u_2, \ldots, u_n]$ is an $n$-vector. By padding $U$ with $h$ zeroes to the left,
we mean to construct the following $(n + h)$-vector:

$$[\underbrace{0, 0, \ldots, 0}_{h}, u_1, u_2, \ldots, u_n]$$

For $1 \le i \le r$, we call the $m + n - d_i - k$ rows of $R_k$ constructed from the coefficients of
$p_i$ the $i$th block of $R_k$. We can construct $R_0$ from $R_h$ by padding each row in $R_h$ with $h$
zeroes to the left and then adding $h$ rows constructed from the coefficients of $p_i$ in front of
the $i$th block for $i = 1, 2, \ldots, r$. Suppose $U_1, U_2, \ldots, U_{m+n-2h}$ are the first
$m + n - 2h$ independent rows of $R_h$. And suppose $U'_1, U'_2, \ldots, U'_{m+n-2h}$ are obtained by
padding each of $U_1, U_2, \ldots, U_{m+n-2h}$ with $h$ zeroes to the left. Let $V_1, V_2, \ldots, V_h$ be
the first $h$ row-vectors of $R_0$. Thus $V_1, V_2, \ldots, V_h$ are constructed from the coefficients of
$p_1$. Let

$$
M = \begin{pmatrix} V_1 \\ \vdots \\ V_h \\ U'_1 \\ \vdots \\ U'_{m+n-2h} \end{pmatrix}
= \begin{pmatrix}
a_{1,d_1} & * & \cdots & * & * \\
0 & a_{1,d_1} & \cdots & * & * \\
\vdots & & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & a_{1,d_1} & * \\
0 & 0 & \cdots & 0 & U_1 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 0 & U_{m+n-2h}
\end{pmatrix}
$$

where $*$ stands for the remaining coefficients of $p_1$ and the last column of the display stands
for the last $m + n - h$ columns of $M$.

Clearly, $V_1, V_2, \ldots, V_h$ and $U'_1, U'_2, \ldots, U'_{m+n-2h}$ are linearly independent. It is easy
to see that $\mathrm{detpol}(M)$ and the $h$-th subresultant are similar. From Corollary 8, we know that
there is no other row-vector of $R_0$ which is linearly independent of
$V_1, V_2, \ldots, V_h, U'_1, U'_2, \ldots, U'_{m+n-2h}$. Thus, from Theorem 9, the GCD of the polynomials
$p_1, p_2, \ldots, p_r$ and $\mathrm{detpol}(M)$ are similar. Hence, the $h$-th subresultant and the GCD of
$p_1, p_2, \ldots, p_r$ are similar. □

5     Independent  Rows  of  a  Matrix 

Suppose that $M$ is a $k \times l$ matrix and the rank of $M$ is $\gamma$. We say that the $i$th row of
$M$ is independent or dependent according as it is linearly independent or linearly dependent of the
first $i - 1$ row-vectors of $M$. Clearly, the rank of $M$ is equal to the number of independent rows
in $M$. In this section, we present a fast parallel algorithm to construct a $\gamma \times l$ matrix
which consists of all the independent rows of $M$.


Algorithm 1 Independent rows of a matrix.

Input. A $k \times l$ matrix $M$.

Output. A $\gamma \times l$ matrix $N$ which consists of all the independent rows of $M$, where
$\gamma$ is the rank of $M$.

1. For $i = 1, 2, \ldots, k$, let $M_i$ be an $i \times l$ matrix which consists of the first $i$ rows of
$M$. Let $SP_1, SP_2, \ldots, SP_k$ be mutually disjoint sets of processors. Each $SP_i$,
for $i = 1, 2, \ldots, k$, computes the rank of $M_i$ by using Mulmuley's algorithm
[13] and then puts the result in $r[i]$.

2. Let $r[0] := 0$.

3. Each $SP_i$ checks $r[i-1]$ and $r[i]$. If $r[i-1] \ne r[i]$, then assign $M[i]$, the $i$th
row of $M$, to $N[r[i]]$.

4. End Algorithm.
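The three steps of Algorithm 1 can be simulated sequentially as follows. This is only a sketch under assumptions: the loop over `i` replaces the disjoint processor sets, exact Gaussian elimination over `fractions.Fraction` stands in for Mulmuley's parallel rank algorithm, and the names `rank` and `independent_rows` are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by exact Gaussian elimination (a sequential stand-in for a parallel rank algorithm)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def independent_rows(m):
    """Return the matrix N of Algorithm 1 for the k x l matrix m."""
    k = len(m)
    r = [0] * (k + 1)                     # step 2: r[0] = 0
    for i in range(1, k + 1):             # step 1: one rank computation per prefix M_i
        r[i] = rank(m[:i])
    n_rows = [None] * r[k]
    for i in range(1, k + 1):             # step 3: row i is independent iff the rank grew
        if r[i - 1] != r[i]:
            n_rows[r[i] - 1] = m[i - 1]   # N[r[i]] in the 1-based notation of the text
    return n_rows


print(independent_rows([[1, 2], [2, 4], [0, 1], [1, 3]]))
```

In the PRAM setting the iterations of both loops are independent of each other, which is what allows them to run concurrently.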
Let $\eta = \max\{k, l\}$. Mulmuley's algorithm [13] runs in parallel time $O(\log^2 \eta)$ by using
$O(\eta^{4.5})$ processors. Thus, step 1 takes $O(\log^2 \eta)$ time using $O(\eta^{5.5})$ processors.
Step 2 takes $O(1)$ time using one processor. Step 3 takes $O(1)$ time using $O(kl)$ processors.
Hence, this algorithm runs in parallel time $O(\log^2 \eta)$ by using $O(\eta^{5.5})$ processors.

Let $R_k$ be the matrix defined in Section 2 (4). We are going to construct a matrix which consists
of all the independent rows of $R_k$. $R_k$ has the following property: if the $i$th row of block
$\nu$ is dependent, then for $j = i + 1, i + 2, \ldots, m + n - d_\nu - k$, the $j$th row of block $\nu$
is also dependent. By this property, we modify Algorithm 1 in order to reduce the number of
processors used.


Algorithm 2 Independent rows of $R_0$.

Input. $R_0$, the Sylvester matrix of $p_1, p_2, \ldots, p_r$.

Output. A $\gamma \times (m + n)$ matrix $M$ which consists of all the independent rows of
$R_0$, where $\gamma$ is the rank of $R_0$.

1. For $i = 2, 3, \ldots, r$, let $M_i$ be a $\sum_{j=1}^{i} (m + n - d_j) \times (m + n)$ matrix which
consists of the first $i$ blocks of $R_0$. Let $SP_2, SP_3, \ldots, SP_r$ be mutually disjoint
sets of processors. Each $SP_i$, for $i = 2, 3, \ldots, r$, computes the rank of $M_i$ by
using Mulmuley's algorithm [13] and then puts the result in $rank[i]$.

2. Let $rank[0] = 0$ and $rank[1] = m + n - d_1$.

3. For $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, m$, let $PE_{ij}$ be a processor and each $PE_{ij}$
checks $rank[i-1]$ and $rank[i]$. If $rank[i-1] \ne rank[i]$ and $j \le rank[i] - rank[i-1]$,
then assign the $j$th row of the $i$th block of $R_0$ to $M[rank[i-1] + j]$.

4. End Algorithm.
Step 1 takes $O(\log^2 (mr))$ time using $O(m^{4.5} r^{5.5})$ processors. Step 2 takes $O(1)$ time using
one processor. Step 3 takes $O(1)$ time using $O(m^2 r)$ processors. Hence, this algorithm runs in
parallel time $O(\log^2 m + \log^2 r)$ by using $O(m^{4.5} r^{5.5})$ processors.

6      The  Algorithm 

In this section, we present an algorithm to compute the GCD of the polynomials
$p_1, p_2, \ldots, p_r$. From Corollary 8, we can compute the degree of the GCD of the polynomials
$p_1, p_2, \ldots, p_r$ by computing the rank of $R_0$. The degree of the GCD is
$h = m + n - \operatorname{rank} R_0$. We can construct $R_h$, since we know $h$. Then, we apply
Algorithm 2 to construct an $(m + n - 2h) \times (m + n - h)$ matrix $M$ which consists of all the
independent rows of $R_h$. And then, we compute $\mathrm{detpol}(M)$ which is, by Theorem 11, the GCD of
the polynomials $p_1, p_2, \ldots, p_r$. To compute the rank of $R_0$, we use the algorithm in
Mulmuley [13]. To compute $\mathrm{detpol}(M)$, we need to compute the determinants of matrices. We
apply the algorithm in Berkowitz [1] to compute the determinant of a matrix.

Algorithm 3 The GCD of many polynomials.

Input Data: Polynomials $p_1(x), p_2(x), \ldots, p_r(x) \in I[x]$.

Output Data: The GCD of the polynomials $p_1(x), p_2(x), \ldots, p_r(x)$.

1. Let $m = \max\{\deg(p_i);\ 1 \le i \le r\}$ and $n = \min\{\deg(p_i);\ 1 \le i \le r\}$.

2. Construct $R_0$.

3. Compute $\operatorname{rank} R_0$. Let $h = m + n - \operatorname{rank} R_0$.

4. Construct $R_h$.

5. Apply Algorithm 2 to construct an $(m + n - 2h) \times (m + n - h)$ matrix $M$ which
consists of all the independent rows of $R_h$.

6. Return $\mathrm{detpol}(M)$.

7. End Algorithm
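The steps of Algorithm 3 can be traced with a sequential sketch. Everything below is an illustration under stated assumptions rather than the parallel method itself: coefficients are integers stored highest power first, exact Gaussian elimination replaces Mulmuley's rank algorithm, a Bareiss elimination replaces Berkowitz's determinant algorithm, the independent rows are selected one by one instead of block-wise as in Algorithm 2, and all function names are hypothetical. The sketch also assumes that $R_h$ is non-empty.

```python
from fractions import Fraction


def build_R(polys, k):
    """R_k of (4): block i has m + n - d_i - k shifted coefficient rows."""
    degs = [len(p) - 1 for p in polys]
    m, n = max(degs), min(degs)
    cols = m + n - k
    return [[0] * s + list(p) + [0] * (cols - s - d - 1)
            for p, d in zip(polys, degs) for s in range(m + n - d - k)]


def rank(matrix):
    """Rank by exact Gaussian elimination over the quotient field."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][c] / rows[rk][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def independent_rows(rows):
    """Keep each row that is linearly independent of the rows before it."""
    kept = []
    for row in rows:
        if rank(kept + [row]) > len(kept):
            kept.append(row)
    return kept


def det(a):
    """Determinant of a square integer matrix by fraction-free (Bareiss) elimination."""
    a = [row[:] for row in a]
    n, sign, prev = len(a), 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def detpol(m):
    """Coefficients of detpol(M), highest power first."""
    k, l = len(m), len(m[0])
    return [det([row[:k - 1] + [row[j]] for row in m]) for j in range(k - 1, l)]


def gcd_many(polys):
    """A polynomial similar to the GCD of the inputs, following steps 1 to 6."""
    degs = [len(p) - 1 for p in polys]
    m, n = max(degs), min(degs)                    # step 1
    h = m + n - rank(build_R(polys, 0))            # steps 2 and 3
    M = independent_rows(build_R(polys, h))        # steps 4 and 5
    if not M:
        raise ValueError("R_h is empty; this sketch does not handle that case")
    return detpol(M)                               # step 6


print(gcd_many([[1, -3, 2], [1, -4, 3], [1, 0, -1]]))
```

The three sample inputs are all multiples of $x - 1$, so the result should be similar to $x - 1$, that is, equal to it up to a non-zero constant factor.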


Step 1 takes $O(\log r)$ time using $O(r)$ processors. Step 2 takes $O(1)$ time using $O(rm^2)$
processors. Step 3 takes $O(\log^2 m + \log^2 r)$ time using $O(m^{4.5} r^{4.5})$ processors. Step 4
takes $O(1)$ time using $O(rm^2)$ processors. Step 5 takes $O(\log^2 m + \log^2 r)$ time using
$O(m^{4.5} r^{5.5})$ processors. In step 6, we need to compute $h + 1$ determinants of matrices.
Berkowitz's algorithm [1] computes the determinant of an $m \times m$ matrix in parallel time
$O(\log^2 m)$ by using $O(m^{3.5})$ processors. Thus, step 6 uses $O(m^{4.5})$ processors and takes
parallel time $O(\log^2 m)$. Hence, the algorithm takes overall parallel time
$O(\log^2 m + \log^2 r)$ by using $O(m^{4.5} r^{5.5})$ processors.

Theorem 12 For polynomials over any integral domain in one indeterminate, if the number of
input polynomials is $r$ and the input polynomials have degree at most $m$, then the GCD of these
polynomials can be computed in parallel time $O(\log^2 m + \log^2 r)$ by using $O(m^{4.5} r^{5.5})$
processors.

Corollary 13 For polynomials over any field in one indeterminate, if the number of input
polynomials is $r$ and the input polynomials have degree at most $m$, then the GCD of these
polynomials can be computed in parallel time $O(\log^2 m + \log^2 r)$ by using $O(m^{4.5} r^{5.5})$
processors.

Corollary 14 Suppose $I$ is an integral domain and $A, B \in I[x]$. Let $\deg(A) = m > \deg(B)$. The
GCD of $A$ and $B$ can be computed in parallel time $O(\log^2 m)$ by using $O(m^{4.5})$ processors.

7      Computing  the  subresultant  PRS 

Suppose $A, B \in I[x]$ and $\deg(A) = m > \deg(B) = n$. We introduce a convenient notation.
Let $A_i(x) \in I[x]$ $(i = 1, \ldots, m)$ be polynomials and let $n = 1 + \max_{1 \le i \le m} \deg(A_i)$.
Then $\mathrm{mat}(A_1, A_2, \ldots, A_m)$ denotes the $m \times n$ matrix whose $(i, n-j)$th entry is the
coefficient of $x^j$ in $A_i(x)$. We shall write $\mathrm{detpol}(A_1, A_2, \ldots, A_m)$ as shorthand for
$\mathrm{detpol}(\mathrm{mat}(A_1, \ldots, A_m))$. The following procedure computes the subresultant chain
of $A$ and $B$.


Procedure 4 The subresultant chain of A and B

Input Data: Polynomials $A(x), B(x) \in I[x]$ with $\deg(A) = m > \deg(B) = n$.

Output Data: $(S_m, S_{m-1}, \ldots, S_0)$, the subresultant chain of $A$ and $B$.

1. Let $S_m := A$ and $S_{m-1} := B$.

2. For $i = n+1, n+2, \ldots, m-2$, let $S_i := 0$.

3. Let $S_n := \mathrm{lead}(B)^{m-n-1} B$.

4. For $i = 0, 1, \ldots, n-1$, let
$S_i := \mathrm{detpol}(x^{n-i-1}A, x^{n-i-2}A, \ldots, A, x^{m-i-1}B, x^{m-i-2}B, \ldots, B)$.

5. End Algorithm.
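Procedure 4 can be followed step by step in a sequential sketch. The code below assumes integer coefficients stored highest power first and uses a Bareiss elimination for the determinants; the zero polynomial is written `[0]`, and the names `det`, `detpol`, `trim` and `subresultant_chain` are illustrative.

```python
def det(a):
    """Determinant of a square integer matrix by fraction-free (Bareiss) elimination."""
    a = [row[:] for row in a]
    n, sign, prev = len(a), 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def detpol(m):
    """Coefficients of detpol(M), highest power first."""
    k, l = len(m), len(m[0])
    return [det([row[:k - 1] + [row[j]] for row in m]) for j in range(k - 1, l)]


def trim(p):
    """Drop leading zeros but keep [0] for the zero polynomial."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return list(p[i:])


def subresultant_chain(A, B):
    """Return [S_m, S_{m-1}, ..., S_0] for deg(A) = m > deg(B) = n."""
    m, n = len(A) - 1, len(B) - 1
    assert m > n >= 0
    S = {m: list(A), m - 1: list(B)}                   # step 1
    for i in range(n + 1, m - 1):                      # step 2: i = n+1, ..., m-2
        S[i] = [0]
    S[n] = [B[0] ** (m - n - 1) * c for c in B]        # step 3
    for i in range(n):                                 # step 4
        polys = [list(A) + [0] * e for e in range(n - i - 1, -1, -1)]
        polys += [list(B) + [0] * e for e in range(m - i - 1, -1, -1)]
        width = m + n - i                              # 1 + largest degree among the rows
        rows = [[0] * (width - len(p)) + p for p in polys]   # this is mat(...)
        S[i] = trim(detpol(rows))
    return [S[i] for i in range(m, -1, -1)]


print(subresultant_chain([1, 0, 0, 1], [2, 0]))        # A = x^3 + 1, B = 2x
```

In step 4 the matrix has $m + n - 2i$ rows and $m + n - i$ columns, so each $S_i$ has degree at most $i$.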
Steps 1, 2, 3 and 4 can be executed in parallel. The procedure takes parallel time $O(\log^2 m)$ and
uses $O(n m^{4.5})$ processors. Next we present an algorithm to compute the subresultant PRS of $A$
and $B$ which is based on the subresultant structure theorem (Ho-Yap [9]).


Algorithm 5 The subresultant PRS of A and B.

Input Data: Polynomials $A(x), B(x) \in I[x]$ with $\deg(A) = m > \deg(B) = n$.

Output Data: $(p_0, p_1, \ldots, p_h)$, the subresultant PRS of $A$ and $B$.

1. Call Procedure 4 to get $(S_m, S_{m-1}, \ldots, S_0)$.

2. Let $mark[m] := 1$.

3. For $i = 0, 1, \ldots, m-1$, let $PE_i$ be a processor. Each $PE_i$ checks $S_i$ and $S_{i+1}$.
If $S_i \ne 0$ and $\deg(S_{i+1}) = i + 1$, then $mark[i] = 1$; else $mark[i] = 0$.

4. Apply the prefix algorithm (Ladner-Fischer [12]) to compute
$mark[i] := \sum_{j=i}^{m} mark[j]$.

5. Let $mark[m+1] := 0$.

6. Each $PE_i$ checks $mark[i+1]$ and $mark[i]$. If $mark[i+1] \ne mark[i]$, then
assign $S_i$ to $p_{mark[i]}$.

7. End Algorithm.

Clearly, this algorithm takes parallel time $O(\log^2 m)$ by using $O(n m^{4.5})$ processors.


8      Acknowledgement 


I  express  my  deep  gratitude  to  Professor  Chee  Yap  for  all  his  support  of  this  research,  his  guidance 
in  this  area  and  many  useful  comments. 


References

[1] S.J. Berkowitz, On Computing the Determinant in Small Parallel Time Using a Small Number
of Processors, Inform. Processing Letters, 18 (1984), pp. 147-150.

[2] A. Borodin, J. von zur Gathen, J. Hopcroft, Fast Parallel Matrix and GCD Computations,
Inform. and Control, 52 (1982), pp. 241-256.

[3] W.S. Brown, On Euclid's Algorithm and the Computation of Polynomial Greatest Common
Divisors, J. Assoc. Comput. Mach., 18 (1971), pp. 476-504.

[4] W.S. Brown and J.F. Traub, On Euclid's Algorithm and the Theory of Subresultants, J.
Assoc. Comput. Mach., 18 (1971), pp. 505-514.

[5] G.E. Collins, Polynomial Remainder Sequences and Determinants, Am. Math. Monthly, 73
(1966), pp. 708-712.

[6] G.E. Collins, Subresultants and Reduced Polynomial Remainder Sequences, J. Assoc. Comput.
Mach., 14 (1967), pp. 128-142.

[7] S. Fortune and J. Wyllie, Parallelism in random access machines, Proc. Symposium on
Theory of Computing, 10 (1978), pp. 114-118.

[8] J. von zur Gathen, Parallel Algorithms for Algebraic Problems, SIAM J. Comput., 13 (1983),
pp. 802-824.

[9] C. Ho and C.K. Yap, Polynomial Remainder Sequences and Theory of Subresultants, Robotics
Report 119 (1987), Courant Institute, NYU.

[10] J. Hong, Computation: Computability, Similarity and Duality, Pitman Publishing Limited,
(1986), pp. 144-147.

[11] K. Kakie, The Resultant of Several Homogeneous Polynomials in Two Indeterminates, Proc.
Amer. Math. Society, 34 (1976), pp. 1-7.

[12] R. Ladner and M. Fischer, Parallel Prefix Computation, J. Assoc. Comput. Mach., 27 (1980),
pp. 831-838.

[13] K. Mulmuley, A Fast Parallel Algorithm to Compute the Rank of a Matrix over an Arbitrary
Field, J. Assoc. Comput. Mach., 23 (1986), pp. 338-339.

[14] C.K. Yap, Lecture Notes on Symbolic Computation, Class notes, Courant Institute, NYU, Fall
1985.

NYU  COMPSCI  TR-352      c.l 

Ho,  Chung-jen 

Fast  parallel  GCD 
algorithms  for  several 
polynomials  over  integral 

